Cognate Mapping - A Heuristic Strategy for the Semi-Supervised Acquisition of a Spanish Lexicon from a Portuguese Seed Lexicon

Authors

  • Stefan Schulz
  • Kornél G. Markó
  • Eduardo Sbrissia
  • Percy Nohama
  • Udo Hahn
Abstract

We deal with the automated acquisition of a Spanish medical subword lexicon from an already existing Portuguese seed lexicon. Using two non-parallel monolingual corpora, we determined Spanish lexeme candidates from Portuguese seed lexicon entries by heuristic cognate mapping. We validated the emergent lexical translation hypotheses by determining the similarity of fixed-window context vectors on the basis of Portuguese and Spanish text corpora.
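The two steps named in the abstract, heuristic cognate mapping followed by validation through fixed-window context vectors, can be illustrated with a minimal Python sketch. The abstract does not give the actual rules or parameters, so everything specific here is an assumption: the substitution rules in `RULES`, the window size, the use of whole words instead of subwords, and the `bridge` dictionary of already accepted Portuguese-to-Spanish pairs used to make the two context vectors comparable. All function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical Portuguese -> Spanish string substitution rules (illustrative
# only; the rules used by the authors are not given in the abstract).
RULES = [("ção", "ción"), ("nh", "ñ"), ("lh", "j"), ("ss", "s")]

def cognate_candidates(pt_word, rules=RULES):
    """Generate candidate Spanish strings by optionally applying each rule."""
    candidates = {pt_word}
    for src, tgt in rules:
        candidates |= {c.replace(src, tgt) for c in candidates}
    return candidates

def map_cognates(pt_lexicon, es_vocab):
    """Keep only candidates that actually occur in the Spanish corpus vocabulary."""
    return {pt: sorted(cognate_candidates(pt) & es_vocab) for pt in pt_lexicon}

def context_vector(tokens, target, window=2):
    """Count the tokens within +/- `window` positions of each target occurrence."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            vec.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return vec

def translate_vector(vec, bridge):
    """Project a Portuguese context vector onto Spanish dimensions.

    `bridge` is an assumed dictionary of already accepted translation pairs;
    context words without an entry are dropped.
    """
    out = Counter()
    for word, count in vec.items():
        if word in bridge:
            out[bridge[word]] += count
    return out

def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Toy corpora (invented for illustration, not taken from the paper).
pt_tokens = "a inflamação do pulmão causa febre".split()
es_tokens = "la inflamación del pulmón causa fiebre".split()
bridge = {"do": "del", "pulmão": "pulmón"}

hypotheses = map_cognates(["inflamação"], set(es_tokens))
pt_vec = translate_vector(context_vector(pt_tokens, "inflamação"), bridge)
es_vec = context_vector(es_tokens, "inflamación")
similarity = cosine(pt_vec, es_vec)
```

In this sketch a translation hypothesis would be accepted when `similarity` exceeds some threshold; the threshold value, like the rule set, would have to be tuned on real corpora.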


Similar resources

Semi-Supervised Acquisition of a Spanish Lexicon from a Portuguese Seed Lexicon

This paper deals with the automated acquisition of a Spanish medical subword lexicon from an already existing Portuguese seed lexicon. Using two nonparallel monolingual corpora we determine Spanish lexeme candidates from Portuguese seed lexicon entries by heuristic cognate mapping. We are still working on the experiments and trying to achieve a good method for validating the translation hypothe...


Automatic Lexicon Acquisition for a Medical Cross-Language Information Retrieval System

We present a method for the automated acquisition of a multilingual medical lexicon (for Spanish and Swedish) to be used within the framework of a medical cross-language text retrieval system. We incorporate seed lexicons and parallel corpora derived from the UMLS Metathesaurus. The seed lexicons for Spanish and Swedish are automatically generated from (previously manually constructed) Portugue...


A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, sentiment analysis has become one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...


Mental Representation of Cognates/Noncognates in Persian-Speaking EFL Learners

The purpose of this study was to investigate the mental representation of cognate and noncognate translation pairs in languages with different scripts to test the prediction of the dual lexicon model (Gollan, Forster, & Frost, 1997). Two groups of Persian-speaking English language learners were tested on cognate and noncognate translation pairs in Persian-English and English-Persian directions with...


A Semi-Supervised Framework Based on a Self-Constructed Adaptive Lexicon for Persian Sentiment Analysis

With the appearance of Web 2.0 and 3.0, users' contributions to the WWW have created a huge amount of valuable expressed opinions. Considering the difficulty or impossibility of manually analyzing such big data, sentiment analysis, as a branch of natural language processing, has received considerable attention. Unlike for other (popular) languages, a limited number of research studies have been conducted in ...




Publication date: 2004